Recently, Robey et al. proposed a notion of probabilistic robustness, which, at a high level, requires a classifier to be robust to most but not all perturbations. They show that for certain hypothesis classes where proper learning under worst-case robustness is \textit{not} possible, proper learning under probabilistic robustness \textit{is} possible with sample complexity exponentially smaller than in the worst-case robustness setting. This motivates the question of whether proper learning under probabilistic robustness is always possible. In this paper, we show that this is \textit{not} the case. We exhibit examples of hypothesis classes $\mathcal{H}$ with finite VC dimension that are \textit{not} probabilistically robustly PAC learnable with \textit{any} proper learning rule. However, if we compare the output of the learner to the best hypothesis for a slightly \textit{stronger} level of probabilistic robustness, we show that not only is proper learning \textit{always} possible, but it is possible via empirical risk minimization.
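As a rough formalization of "robust to most but not all perturbations" (the notation here is an assumption for illustration and is not given in the abstract): let $\mathcal{G}$ be a set of perturbation maps, $\mu$ a probability measure over $\mathcal{G}$, and $\rho \in [0,1)$ a tolerance. A probabilistically robust 0-1 loss can then be sketched as
$$\ell_{\rho}(h,(x,y)) = \mathbb{1}\left\{ \Pr_{g \sim \mu}\left[ h(g(x)) \neq y \right] > \rho \right\}.$$
Under this reading, a smaller $\rho$ is a stronger requirement, and comparing against "a slightly stronger level" would mean evaluating the learner at tolerance $\rho$ against the best hypothesis at some $\rho' < \rho$.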
translated by Google Translate
Text-to-text generation models have increasingly become the go-to solution for a wide variety of sequence labeling tasks (e.g., entity extraction and dialog slot filling). While most research has focused on labeling accuracy, a key aspect of vital practical importance has slipped through the cracks: understanding model confidence. More specifically, we lack a principled understanding of how to reliably gauge the confidence of a model in its predictions for each labeled span. This paper aims to provide some empirical insights on estimating model confidence for generative sequence labeling. Most notably, we find that simply using the decoder's output probabilities is not the best way to obtain well-calibrated confidence estimates. As verified over six public datasets covering different tasks, we show that our proposed approach -- which leverages statistics from the top-$k$ predictions of a beam search -- significantly reduces the calibration errors of a generative sequence labeling model's predictions.
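One simple way to turn top-$k$ beam statistics into a per-span confidence can be sketched as follows. This is an illustrative assumption, not necessarily the estimator proposed in the paper: the function name `span_confidences`, the input format (a list of beam log-probabilities paired with the spans each beam predicts), and the choice of normalizing scores over the beams are all hypothetical.

```python
import math
from collections import defaultdict


def span_confidences(beams):
    """Estimate a confidence for each predicted span from top-k beams.

    `beams` is a list of (log_prob, spans) pairs, where `spans` is an
    iterable of hashable items such as (text, label). Hypothetical format.
    """
    if not beams:
        return {}
    # Normalize sequence scores across the k beams (softmax over
    # log-probabilities, shifted by the max for numerical stability).
    max_lp = max(lp for lp, _ in beams)
    weights = [math.exp(lp - max_lp) for lp, _ in beams]
    total = sum(weights)

    conf = defaultdict(float)
    for weight, (_, spans) in zip(weights, beams):
        # Count each span at most once per beam.
        for span in set(spans):
            conf[span] += weight / total
    return dict(conf)


# Toy example with three beams (made-up numbers).
example_beams = [
    (math.log(0.5), [("Paris", "LOC")]),
    (math.log(0.3), [("Paris", "LOC"), ("Bob", "PER")]),
    (math.log(0.2), [("Paris", "ORG")]),
]
example_conf = span_confidences(example_beams)
```

A span that appears in most high-scoring beams receives a confidence near 1, while a span that appears only in a low-scoring beam receives a low one, which is the intuition behind using beam agreement rather than the raw decoder probability of a single output.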
The inception of large language models has helped advance state-of-the-art performance on numerous natural language tasks. This has also opened the door for the development of foundation models for other domains and data modalities such as images, code, and music. In this paper, we argue that business process data representations have unique characteristics that warrant the development of a new class of foundation models to handle tasks like process mining, optimization, and decision making. These models should also tackle the unique challenges of applying AI to business processes, which include data scarcity, multi-modal representations, domain-specific terminology, and privacy concerns.
Observational studies have recently received significant attention from the machine learning community due to the increasing availability of non-experimental observational data and the limitations of experimental studies, such as considerable cost, impracticality, and small, less representative samples. In observational studies, de-confounding is a fundamental problem in individualised treatment effect (ITE) estimation. This paper proposes disentangled representations with adversarial training to selectively balance the confounders in the binary treatment setting for ITE estimation. The adversarial training of the treatment policy selectively encourages treatment-agnostic balanced representations for the confounders and helps to estimate the ITE in observational studies via counterfactual inference. Empirical results on synthetic and real-world datasets, with varying degrees of confounding, show that our proposed approach improves on state-of-the-art methods, achieving lower error in ITE estimation.
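Sequence labeling is a core NLP task that forms the backbone of many applications. Supervised learning of Seq2Seq models (such as T5) has shown great success on these problems. However, there is a significant disconnect between the training objectives of these models and the metrics and desiderata we care about in practical applications. For example, a practical sequence labeling application may want to optimize a certain precision-recall trade-off (of the top-$k$ predictions), which is quite different from the standard objective of maximizing the likelihood of the gold labeled sequence. To bridge this gap, we propose GROOT, a simple yet effective framework for Generative Reward Optimization Of Text sequences. GROOT works by training a generative sequence labeling model to match the decoder output distribution with that of the (black-box) reward function. Using an iterative training regime, we first generate prediction candidates, then correct errors in them, and finally contrast those candidates (based on their reward values). As demonstrated by extensive experiments on four public benchmarks, GROOT significantly improves all reward metrics. Furthermore, GROOT also improves the overall decoder distribution, as evidenced by the quality gains of the top-$k$ candidates.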
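Retrieval-augmented generation models offer many benefits over standalone language models: besides a textual answer to a given query, they provide provenance items retrieved from an updateable knowledge base. However, they are also more complex systems and need to handle long inputs. In this work, we introduce FiD-Light to strongly increase the efficiency of the state-of-the-art retrieval-augmented FiD model, while maintaining the same level of effectiveness. Our FiD-Light model constrains the information flow from the encoder (which encodes passages separately) to the decoder (which uses concatenated encoded representations). Furthermore, we adapt FiD-Light with re-ranking capabilities through textual source pointers to improve the top-ranked provenance precision. Our experiments on a diverse set of seven knowledge-intensive tasks (KILT) show that FiD-Light consistently improves the Pareto frontier between query latency and effectiveness. FiD-Light with source pointing sets new state-of-the-art results on six KILT tasks for combined text generation and provenance retrieval evaluation, while maintaining reasonable efficiency.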
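We experimentally validate a real-time machine learning framework capable of controlling the pump power values of a Raman amplifier to shape the signal power evolution in two dimensions (2D): frequency and fiber distance. In our setup, the power values of four first-order counter-propagating pumps are optimized to achieve the desired 2D power profile. The pump power optimization framework consists of a convolutional neural network (CNN) followed by a differential evolution (DE) technique, applied online to the amplifier setup to automatically achieve the target 2D power profiles. The results on achievable 2D profiles show that the framework is able to guarantee a low maximum absolute error (MAE) (<0.5 dB) between the obtained and the target 2D profiles. Moreover, the framework is tested in a multi-objective design scenario where the goal is to achieve 2D profiles with a fixed gain level at the end of the span, jointly with minimum spectral excursion over the entire fiber length. In this case, the experimental results show that, for 2D profiles with target flat gain levels, DE obtains a maximum gain deviation of less than 1 dB when the setup is not physically limited in its pump power values. The simulation results also show that, with enough pump power available, a better gain deviation (less than 0.6 dB) is achievable for higher target gain levels.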
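The diversified ecology of nature exhibits various forms of swarm behavior in many species. Butterflies are one of the prominent species, with somewhat insightful random flights, and converting that behavior into an artificial metaphor would open up enormous possibilities. This paper considers one such metaphor, known as Butterfly Mating Optimization (BMO). In BMO, the Bfly follows the patrolling mating phenomenon and simultaneously captures all the local optima of multimodal functions. To imitate this algorithm, a mobile robot (Bflybot) was designed to meet the features of the Bfly in the BMO algorithm. In addition, a multi-Bflybot swarm was designed to act like butterflies in nature and follow the algorithm's rules. Real-time experiments were performed with the BMO algorithm in a multi-robot arena, using a light source as the signal source. The experimental results show that the BMO algorithm is applicable to detecting multiple signal sources with significant variations in their movements, i.e., static and dynamic. In the case of static signal sources, varying the initial locations of the Bflybots affects the convergence in terms of time and smoothness, whereas experiments with varying step sizes lead to variations in the execution time and speed of the bots. In this work, experiments were also performed in a dynamic environment, with the signal source moving in both maneuvering and non-maneuvering scenarios. The Bflybot swarm is able to detect single and multiple signal sources moving linearly between two fixed points, in circular motion, and in up-and-down motion. To evaluate the BMO phenomenon, various ongoing and prospective works are discussed, such as mid-sea ship detection, aerial search applications, and earthquake prediction.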
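From higher computational efficiency to enabling the discovery of novel and complex structures, deep learning has emerged as a powerful framework for the design and optimization of nanophotonic circuits and components. However, both data-driven and exploration-based machine learning strategies have limitations in their effectiveness for nanophotonic inverse design. Supervised machine learning approaches require large quantities of training data to produce high-performance models and have difficulty generalizing beyond the training data given the complexity of the design space. Unsupervised and reinforcement learning-based approaches, on the other hand, can have very lengthy training or optimization times associated with them. Here we demonstrate a hybrid supervised learning and reinforcement learning approach to the inverse design of nanophotonic structures, and show that this approach can reduce training data dependence, improve the generalizability of model predictions, and shorten exploratory training times by orders of magnitude. The presented strategy thus addresses a number of contemporary deep learning challenges, while opening the door to new design methodologies that leverage multiple classes of machine learning algorithms to produce more effective and practical solutions for photonic design.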
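In cable-driven parallel robots (CDPRs), a single cable malfunction usually causes complete failure of the entire robot. However, the static workspace lost due to the failure can often be recovered by reconfiguring the cable attachment points on the frame. This capability is introduced by adding kinematic redundancies to the robot in the form of moving linear sliders that are manipulated by a real-time redundancy resolution controller. The presented work combines this controller with an online failure detection framework to develop a complete fault-tolerant control scheme for automatic task recovery. The solution provides robustness by combining pose estimation of the end-effector with failure detection through an Interactive Multiple Model (IMM) algorithm that relies only on end-effector information. The failure and pose estimation scheme is then tied into the redundancy resolution approach to produce a seamless automatic task (trajectory) recovery approach for cable failures.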